        Use Cases and Operational Experience with Multipath TCP

Abstract

   This document discusses both use cases and operational experience
   with Multipath TCP (MPTCP) in real networks.  It lists several
   prominent use cases where Multipath TCP has been considered and is
   being used.  It also gives insight into some heuristics and decisions
   that have helped to realize these use cases and suggests possible
   improvements.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc8041.

1.  Introduction

   Multipath TCP was specified in [RFC6824] and five independent
   implementations have been developed.  As of November 2016, Multipath
   TCP has been or is being implemented on the following platforms:

   o  Linux kernel [MultipathTCP-Linux]

   o  Apple iOS and macOS

   o  Citrix load balancers

   o  FreeBSD [FreeBSD-MPTCP]

   o  Oracle Solaris

   The first three implementations are known to interoperate.  Three of
   these implementations are open source (Linux kernel, FreeBSD and
   Apple's iOS and macOS).  Apple's implementation is widely deployed.

   Since the publication of [RFC6824] as an Experimental RFC, experience
   has been gathered by various network researchers and users about the
   operational issues that arise when Multipath TCP is used in today's
   Internet.

   When the MPTCP working group was created, several use cases for
   Multipath TCP were identified [RFC6182].  Since then, other use cases
   have been proposed and some have been tested and even deployed.  We
   describe these use cases in Section 2.

   Section 3 focuses on the operational experience with Multipath TCP.
   Most of this experience comes from the utilization of the Multipath
   TCP implementation in the Linux kernel [MultipathTCP-Linux].  This
   open-source implementation has been downloaded and installed by
   thousands of users all over the world.  Many of these users have
   provided direct or indirect feedback by writing documents (scientific
   articles or blog messages) or posting to the mptcp-dev mailing list
   (see https://listes-2.sipr.ucl.ac.be/sympa/arc/mptcp-dev).  This
   Multipath TCP implementation is actively maintained and continuously
   improved.  It is used on various types of hosts, ranging from
   smartphones or embedded routers to high-end servers.

   The Multipath TCP implementation in the Linux kernel is far from
   being the most widespread deployment of Multipath TCP.  Since
   September 2013, Multipath TCP has also been supported on smartphones
   and tablets, beginning with iOS7 [IETFJ].  There are likely hundreds
   of millions of MPTCP-enabled devices.  This Multipath TCP
   implementation is currently only used to support the Siri voice
   recognition/control application.  Some lessons learned from this
   deployment are described in [IETFJ].

   Section 3 is organized as follows.  Supporting the middleboxes was
   one of the difficult issues in designing the Multipath TCP protocol.
   We explain in Section 3.1 which types of middleboxes the Linux kernel
   implementation of Multipath TCP supports and how it reacts upon
   encountering these.  Section 3.2 summarizes the MPTCP-specific
   congestion controls that have been implemented.  Sections 3.3 to 3.7
   discuss heuristics and issues with respect to subflow management as
   well as the scheduling across the subflows.  Section 3.8 explains
   some problems that occurred with subflows having different Maximum
   Segment Size (MSS) values.  Section 3.9 presents an issue with
   content delivery networks and suggests a solution to this issue.
   Finally, Section 3.10 documents an issue with captive portals where
   MPTCP will behave suboptimally.

2.  Use Cases

   Multipath TCP has been tested in several use cases.  There is already
   an abundant amount of scientific literature on Multipath TCP
   [MPTCPBIB].  Several of the papers published in the scientific
   literature have identified possible improvements that are worth being
   discussed here.

2.1.  Datacenters

   A first, although initially unexpected, documented use case for
   Multipath TCP has been in datacenters [HotNets][SIGCOMM11].  Today's
   datacenters are designed to provide several paths between single-
   homed servers.  The multiplicity of these paths comes from the
   utilization of Equal-Cost Multipath (ECMP) and other load-balancing
   techniques inside the datacenter.  Most of the deployed load-
   balancing techniques in datacenters rely on hashes computed over the
   five tuple.  Thus, all packets from the same TCP connection follow
   the same path, so they are not reordered.  The results in [HotNets]
   demonstrate by simulations that Multipath TCP can achieve a better
   utilization of the available network by using multiple subflows for
   each Multipath TCP session.  Although [RFC6182] assumes that at least
   one of the communicating hosts has several IP addresses, [HotNets]
   demonstrates that Multipath TCP is beneficial when both hosts are
   single-homed.  This idea is analyzed in more detail in [SIGCOMM11],
   where the Multipath TCP implementation in the Linux kernel is
   modified to be able to use several subflows from the same IP address.
   Measurements in a public datacenter show the quantitative benefits of
   Multipath TCP [SIGCOMM11] in this environment.
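
   As an illustration of why several subflows help between single-homed
   servers, the following Python sketch models flow-based load
   balancing.  The hash function (SHA-256 truncated to 32 bits), the
   addresses, the port numbers, and the number of paths are assumptions
   chosen for the example; real switches use vendor-specific hashes.

```python
import hashlib


def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto, n_paths):
    """Pick a path index from a hash of the five tuple.

    Illustrative only: the hash below is an assumption, not the
    algorithm of any particular switch.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def paths_used(src_ip, dst_ip, src_ports, dst_port, n_paths):
    """Return the set of paths reached by flows that differ only in source port."""
    return {ecmp_path(src_ip, dst_ip, p, dst_port, 6, n_paths) for p in src_ports}


# One regular TCP connection: a single five tuple, hence a single path.
single = paths_used("10.0.0.1", "10.0.1.1", [40000], 80, 4)

# Eight subflows between the same two addresses, differing in source port.
multi = paths_used("10.0.0.1", "10.0.1.1", range(40000, 40008), 80, 4)

print("paths used by one connection:", sorted(single))
print("paths used by eight subflows:", sorted(multi))
```

   Every packet of one five tuple maps to the same path, which is why a
   single TCP connection cannot spread over the fabric, while subflows
   with distinct source ports can land on different paths.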

   Although ECMP is widely used inside datacenters, this is not the only
   environment where there are different paths between a pair of hosts.
   ECMP and other load-balancing techniques such as Link Aggregation
   Groups (LAGs) are widely used in today's networks; having multiple
   paths between a pair of single-homed hosts is becoming the norm
   instead of the exception.  Although these multiple paths often have
   the same cost (from an IGP metrics viewpoint), they do not
   necessarily have the same performance.  For example, [IMC13c] reports
   the results of a long measurement study showing that load-balanced
   Internet paths between the same pair of hosts can have huge delay
   differences.

2.2.  Cellular/WiFi Offload

   A second use case that has been explored by several network
   researchers is the cellular/WiFi offload use case.  Smartphones or
   other mobile devices equipped with two wireless interfaces are a very
   common use case for Multipath TCP.  As of September 2015, this is
   also the largest deployment of MPTCP-enabled devices [IETFJ].  It was
   briefly discussed during IETF 88 [IETF88], but there is no published
   paper or report that analyzes this deployment.  For this reason, we
   only discuss published papers that have mainly used the Multipath TCP
   implementation in the Linux kernel for their experiments.

   The performance of Multipath TCP in wireless networks was briefly
   evaluated in [NSDI12].  One experiment analyzes the performance of
   Multipath TCP on a client with two wireless interfaces.  This
   evaluation shows that when the receive window is large, Multipath TCP
   can efficiently use the two available links.  However, if the window
   becomes smaller, then packets sent on a slow path can block the
   transmission of packets on a faster path.  In some cases, the
   performance of Multipath TCP over two paths can become lower than the
   performance of regular TCP over the best performing path.  Two
   heuristics, reinjection and penalization, are proposed in [NSDI12] to
   solve this identified performance problem.  These two heuristics have
   since been used in the Multipath TCP implementation in the Linux
   kernel.  [CONEXT13] explored the problem in more detail and revealed
   some other scenarios where Multipath TCP can have difficulties in
   efficiently pooling the available paths.  Improvements to the
   Multipath TCP implementation in the Linux kernel are proposed in
   [CONEXT13] to cope with some of these problems.

   The first experimental analysis of Multipath TCP in a public wireless
   environment was presented in [Cellnet12].  These measurements explore
   the ability of Multipath TCP to use two wireless networks (real WiFi
   and 3G networks).  Three modes of operation are compared.  The first
   mode of operation is the simultaneous use of the two wireless
   networks.  In this mode, Multipath TCP pools the available resources
   and uses both wireless interfaces.  This mode provides fast handover
   from WiFi to cellular or the opposite when the user moves.
   Measurements presented in [CACM14] show that the handover from one
   wireless network to another is not an abrupt process.  When a host
   moves, there are regions where the quality of one of the wireless
   networks is weaker than the other, but the host considers this
   wireless network to still be up.  When a mobile host enters such
   regions, its ability to send packets over another wireless network is
   important to ensure a smooth handover.  This is clearly illustrated
   by the packet trace discussed in [CACM14].

   Many cellular networks use volume-based pricing; users often prefer
   to use unmetered WiFi networks when available instead of metered
   cellular networks.  [Cellnet12] implements support for the MP_PRIO
   option to explore two other modes of operation.

   In the backup mode, Multipath TCP opens a TCP subflow over each
   interface, but the cellular interface is configured in backup mode.
   This implies that data flows only over the WiFi interface when both
   interfaces are considered to be active.  If the WiFi interface fails,
   then the traffic switches quickly to the cellular interface, ensuring
   a smooth handover from the user's viewpoint [Cellnet12].  The cost of
   this approach is that the WiFi and cellular interfaces are likely to
   remain active all the time since all subflows are established over
   the two interfaces.

   The single-path mode is slightly different.  This mode benefits from
   the break-before-make capability of Multipath TCP.  When an MPTCP
   session is established, a subflow is created over the WiFi interface.
   No packet is sent over the cellular interface as long as the WiFi
   interface remains up [Cellnet12].  This implies that the cellular
   interface can remain idle and battery capacity is preserved.  When
   the WiFi interface fails, a new subflow is established over the
   cellular interface in order to preserve the established Multipath TCP
   sessions.  Compared to the backup mode described earlier,
   measurements reported in [Cellnet12] indicate that this mode of
   operation is characterized by a throughput drop while the cellular
   interface is brought up and the subflows are reestablished.
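
   The modes above differ in which subflows exist at a given time, but
   the rule for choosing the subflows that may carry data can be stated
   once.  The Python sketch below is a simplified model, not the Linux
   implementation; the Subflow record and the usable_subflows() helper
   are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    up: bool
    backup: bool = False  # in the backup mode, signaled with MP_PRIO


def usable_subflows(subflows):
    """Return the subflows that may carry data.

    Regular subflows are preferred; backup subflows are used only when
    no regular subflow is up.
    """
    regular = [s for s in subflows if s.up and not s.backup]
    if regular:
        return regular
    return [s for s in subflows if s.up and s.backup]


# Backup mode: both subflows exist, cellular is flagged as backup.
wifi = Subflow("wifi", up=True)
cell = Subflow("cellular", up=True, backup=True)
print([s.name for s in usable_subflows([wifi, cell])])  # WiFi only

wifi.up = False  # WiFi fails: traffic moves to the backup subflow
print([s.name for s in usable_subflows([wifi, cell])])
```

   In the single-path mode, the cellular subflow would not exist in the
   list until the WiFi subflow fails, which is where the throughput drop
   reported in [Cellnet12] comes from.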

   From a protocol viewpoint, [Cellnet12] discusses the problem posed by
   the unreliability of the REMOVE_ADDR option and proposes a small
   protocol extension to allow hosts to reliably exchange this option.
   It would be useful to analyze packet traces to understand whether the
   unreliability of the REMOVE_ADDR option poses an operational problem
   in real deployments.

   Another study of the performance of Multipath TCP in wireless
   networks was reported in [IMC13b].  This study uses laptops connected
   to various cellular ISPs and WiFi hotspots.  It compares various file
   transfer scenarios.  [IMC13b] observes that 4-path MPTCP outperforms
   2-path MPTCP, especially for larger files.  However, for three
   congestion-control algorithms (LIA, OLIA, and Reno -- see
   Section 3.2), there is no significant performance difference for file
   sizes smaller than 4 MB.

   A different study of the performance of Multipath TCP with two
   wireless networks is presented in [INFOCOM14].  In this study the two
   networks had different qualities: a good network and a lossy network.
   When using two paths with different packet-loss ratios, the Multipath
   TCP congestion-control scheme moves traffic away from the lossy link
   that is considered to be congested.  However, [INFOCOM14] documents
   an interesting scenario that is summarized hereafter.

   client ----------- path1 -------- server
     |                                  |
     +--------------- path2 ------------+

       Figure 1: Simple network topology

   Initially, the two paths in Figure 1 have the same quality and
   Multipath TCP distributes the load over both of them.  During the
   transfer, path2 becomes lossy, e.g., because the client moves.
   Multipath TCP detects the packet losses and retransmits the lost
   packets over path1.  This enables the data transfer to continue over
   this path.  However, the subflow over path2 is still up and transmits
   one packet from time to time.  Although the N packets lost on path2
   have been acknowledged over the first subflow (at the MPTCP level),
   they have not been acknowledged at the TCP level over the second
   subflow.  To preserve the continuity of the sequence numbers over the
   second subflow, TCP will continue to retransmit these segments until
   either they are acknowledged or the maximum number of retransmissions
   is reached.  This behavior is clearly inefficient and may lead to
   blocking since the second subflow will consume window space to be
   able to retransmit these packets.  [INFOCOM14] proposes a new
   Multipath TCP option to solve this problem.  In practice, a new TCP
   option is probably not required.  When the client detects that the
   data transmitted over the second subflow has been acknowledged over
   the first subflow, it could decide to terminate the second subflow by
   sending a RST segment.  If the interface associated with this subflow
   is still up, a new subflow could be immediately reestablished.  It
   would then be immediately usable to send new data and would not be
   forced to first retransmit the previously transmitted data.  As of
   this writing, this dynamic management of the subflows is not yet
   implemented in the Multipath TCP implementation in the Linux kernel.
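
   The check suggested above can be sketched as follows.  This is a
   hypothetical helper, not code from any implementation: the name
   should_reset_subflow(), the representation of outstanding data as
   (start, end) data sequence number ranges, and the absence of sequence
   number wraparound handling are all simplifying assumptions.

```python
def should_reset_subflow(unacked_dsn_ranges, data_ack):
    """Decide whether a stalled subflow could be reset.

    unacked_dsn_ranges: (start, end) data sequence ranges sent on this
        subflow that are still unacknowledged at the TCP level; end is
        exclusive.
    data_ack: cumulative MPTCP-level acknowledgment, i.e., the next data
        sequence number expected by the peer.

    Returns True when everything still outstanding on the subflow has
    already been acknowledged at the MPTCP level (through another
    subflow), so retransmitting it here is useless.
    """
    return bool(unacked_dsn_ranges) and all(
        end <= data_ack for _, end in unacked_dsn_ranges
    )


# Two segments are stuck on path2 but were delivered over path1.
print(should_reset_subflow([(100, 200), (200, 300)], data_ack=300))
# Part of the outstanding data has not been delivered anywhere yet.
print(should_reset_subflow([(100, 200), (200, 300)], data_ack=250))
```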

   Some studies have started to analyze the performance of Multipath TCP
   on smartphones with real applications.  In contrast with the bulk
   transfers that are used by many publications, many deployed
   applications do not exchange huge amounts of data and mainly use
   small connections.  [COMMAG2016] proposes a software testing
   framework that automates Android applications in order to study their
   interactions with Multipath TCP.  [PAM2016] analyzes a one-month
   packet trace of all the packets exchanged by a dozen smartphones
   used by regular users.  This analysis reveals that short
   connections are important on smartphones and that the main benefit of
   using Multipath TCP on smartphones is the ability to perform seamless
   handovers between different wireless networks.  Long connections
   benefit from these handovers.

2.3.  Multipath TCP Proxies

   As Multipath TCP is not yet widely deployed on both clients and
   servers, several deployments have used various forms of proxies.  Two
   families of solutions are currently being used or tested.

   A first use case is when an MPTCP-enabled client wants to use several
   interfaces to reach a regular TCP server.  A typical use case is a
   smartphone that needs to use both its WiFi and its cellular interface
   to transfer data.  Several types of proxies are possible for this use
   case.  An HTTP proxy deployed on an MPTCP-capable server would enable
   the smartphone to use Multipath TCP to access regular web servers.
   Obviously, this solution only works for applications that rely on
   HTTP.  Another possibility is to use a proxy that can convert any
   Multipath TCP connection into a regular TCP connection.  MPTCP-
   specific proxies have been proposed [HotMiddlebox13b] [HAMPEL].

   Another possibility leverages the SOCKS protocol [RFC1928].  SOCKS is
   often used in enterprise networks to allow clients to reach external
   servers.  For this, the client opens a TCP connection to the SOCKS
   server that relays it to the final destination.  If both the client
   and the SOCKS server use Multipath TCP, but not the final
   destination, then Multipath TCP can still be used on the path between
   the clients and the SOCKS server.  At IETF 93, Korea Telecom
   announced that they have deployed (in June 2015) a commercial service
   that uses Multipath TCP on smartphones.  These smartphones access
   regular TCP servers through a SOCKS proxy.  This enables them to
   achieve throughputs of up to 850 Mbps [KT].

   Measurements performed with Android smartphones [Mobicom15] show that
   popular applications work correctly through a SOCKS proxy and MPTCP-
   enabled smartphones.  Thanks to Multipath TCP, long-lived connections
   can be spread over the two available interfaces.  However, for short-
   lived connections, most of the data is sent over the initial subflow
   that is created over the interface corresponding to the default route
   and the second subflow is almost not used [PAM2016].

   A second use case is when Multipath TCP is used by middleboxes,
   typically inside access networks.  Various network operators are
   discussing and evaluating solutions for hybrid access networks
   [TR-348].  Such networks arise when a network operator controls two
   different access network technologies, e.g., wired and cellular, and
   wants to combine them to improve the bandwidth offered to the end
   users [HYA-ARCH].  Several solutions are currently being investigated
   for such networks [TR-348].  Figure 2 shows the organization of such
   a network.  When a client creates a normal TCP connection, it is
   intercepted by the Hybrid CPE (HCPE), which converts it into a
   Multipath TCP connection so that it can use the available access
   networks (DSL and LTE in the example).  The Hybrid Access Gateway
   (HAG) does the opposite to ensure that the regular server sees a
   normal TCP connection.  Some of the solutions currently discussed for
   hybrid networks use Multipath TCP on the HCPE and the HAG.  Other
   solutions rely on tunnels between the HCPE and the HAG [GRE-NOTIFY].

   client --- HCPE ------ DSL ------- HAG --- internet --- server
               |                       |
               +------- LTE -----------+

                      Figure 2: Hybrid Access Network

3.  Operational Experience

3.1.  Middlebox Interference

   The interference caused by various types of middleboxes has been an
   important concern during the design of the Multipath TCP protocol.
   Three studies on the interactions between Multipath TCP and
   middleboxes are worth discussing.

   The first analysis appears in [IMC11].  This paper was the main
   motivation for Multipath TCP incorporating various techniques to cope
   with middlebox interference.  More specifically, Multipath TCP has
   been designed to cope with middleboxes that:

   o  change source or destination addresses

   o  change source or destination port numbers

   o  change TCP sequence numbers

   o  split or coalesce segments

   o  remove TCP options

   o  modify the payload of TCP segments

   These middlebox interferences have all been included in the MBtest
   suite [MBTest].  This test suite is used in [HotMiddlebox13] to
   verify the reaction of the Multipath TCP implementation in the Linux
   kernel [MultipathTCP-Linux] when faced with middlebox interference.
   The test environment used for this evaluation is a dual-homed client
   connected to a single-homed server.  The middlebox behavior can be
   activated on any of the paths.  The main results of this analysis
   are:

   o  the Multipath TCP implementation in the Linux kernel is not
      affected by a middlebox that performs NAT or modifies TCP sequence
      numbers

   o  when a middlebox removes the MP_CAPABLE option from the initial
      SYN segment, the Multipath TCP implementation in the Linux kernel
      falls back correctly to regular TCP

   o  when a middlebox removes the DSS option from all data segments,
      the Multipath TCP implementation in the Linux kernel falls back
      correctly to regular TCP

   o  when a middlebox performs segment coalescing, the Multipath TCP
      implementation in the Linux kernel is still able to accurately
      extract the data corresponding to the indicated mapping

   o  when a middlebox performs segment splitting, the Multipath TCP
      implementation in the Linux kernel correctly reassembles the data
      corresponding to the indicated mapping.  [HotMiddlebox13] shows,
      in Figure 4 in Section 3.3, a corner case with segment splitting
      that may lead to a desynchronization between the two hosts.
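
   The two fallback results in the list above can be illustrated with a
   toy model.  The Python sketch below is an assumption-laden
   simplification: segments are reduced to lists of option names, and
   strip_options() and connection_mode() are hypothetical helpers.  The
   precise fallback rules are those of [RFC6824].

```python
def strip_options(options, removed):
    """Model a middlebox that removes the named TCP options from a segment."""
    return [opt for opt in options if opt not in removed]


def connection_mode(syn_options_at_server, data_options_at_receiver):
    """Return "mptcp" or "tcp" for the two fallback cases listed above."""
    if "MP_CAPABLE" not in syn_options_at_server:
        # The option was stripped from the initial SYN: the handshake
        # completes as a regular TCP handshake.
        return "tcp"
    if "DSS" not in data_options_at_receiver:
        # No data sequence mapping arrives with the data: fall back.
        return "tcp"
    return "mptcp"


clean_syn = ["MSS", "MP_CAPABLE"]
print(connection_mode(clean_syn, ["DSS"]))
print(connection_mode(strip_options(clean_syn, {"MP_CAPABLE"}), ["DSS"]))
print(connection_mode(clean_syn, strip_options(["DSS"], {"DSS"})))
```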

   The interactions between Multipath TCP and real deployed middleboxes
   are also analyzed in [HotMiddlebox13]; a particular scenario with the
   FTP Application Level Gateway running on a NAT is described.

   Middlebox interference can also be detected by analyzing packet
   traces on MPTCP-enabled servers.  A closer look at the packets
   received on the multipath-tcp.org server [TMA2015] shows that among
   the 184,000 Multipath TCP connections, only 125 fell back to regular
   TCP.  These connections originated from 28 different client IP
   addresses and comprise 91 HTTP connections and 34 FTP connections.
   The FTP interference is expected since Application Level Gateways
   used for FTP modify the TCP payload and the DSS Checksum detects
   these modifications.  The HTTP interference appeared only in the
   direction from server to client and could have been
   caused by transparent proxies deployed in cellular or enterprise
   networks.  A longer trace is discussed in [COMCOM2016] and similar
   conclusions about the middlebox interference are provided.

   From an operational viewpoint, knowing that Multipath TCP can cope
   with various types of middlebox interference is important.  However,
   there are situations where the network operators need to gather
   information about where a particular middlebox interference occurs.
   The tracebox software [tracebox] described in [IMC13a] is an
   extension of the popular traceroute software that enables network
   operators to check at which hop a particular field of the TCP header
   (including options) is modified.  It has been used by several network
   operators to debug various middlebox interference problems.
   Experience with tracebox indicates that supporting the ICMP extension
   defined in [RFC1812] makes it easier to debug middlebox problems in
   IPv4 networks.

   Users of the Multipath TCP implementation have reported some
   experience with middlebox interference.  The strangest scenario has
   been a middlebox that accepts the Multipath TCP options in the SYN
   segment but later replaces Multipath TCP options with a TCP EOL
   option [StrangeMbox].  This causes Multipath TCP to perform a
   fallback to regular TCP without any impact on the application.

3.2.  Congestion Control

   Congestion control has been an important challenge for Multipath TCP.
   The coupled congestion-control scheme defined in [RFC6356] is an
   adaptation of the NewReno algorithm.  A detailed description of this
   coupled algorithm is provided in [NSDI11].  It is the default scheme
   in the Linux implementation of Multipath TCP, but Linux supports
   other schemes.
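
   For readers unfamiliar with the coupled scheme, the following Python
   sketch computes the per-ACK window increase of [RFC6356] during
   congestion avoidance.  Windows are in bytes and round-trip times in
   seconds; the function names and the sample values are assumptions
   made for the example, and slow start and loss handling are omitted.

```python
def lia_alpha(cwnds, rtts):
    """Aggressiveness factor alpha of the coupled scheme.

    alpha = cwnd_total * max_i(cwnd_i / rtt_i^2) / (sum_i(cwnd_i / rtt_i))^2
    """
    total = sum(cwnds)
    best = max(c / (r * r) for c, r in zip(cwnds, rtts))
    denom = sum(c / r for c, r in zip(cwnds, rtts)) ** 2
    return total * best / denom


def lia_increase(i, cwnds, rtts, bytes_acked, mss):
    """Bytes added to cwnds[i] for one ACK received on subflow i."""
    total = sum(cwnds)
    coupled = lia_alpha(cwnds, rtts) * bytes_acked * mss / total
    uncoupled = bytes_acked * mss / cwnds[i]  # what regular TCP would add
    return min(coupled, uncoupled)


# Two identical subflows: each one grows more slowly than a regular TCP flow.
cwnds, rtts = [10000, 10000], [0.5, 0.5]
print(lia_alpha(cwnds, rtts))
print(lia_increase(0, cwnds, rtts, bytes_acked=1000, mss=1000))
```

   With a single subflow, alpha evaluates to 1 and the coupled and
   uncoupled terms coincide, so the connection increases its window like
   regular TCP.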

   The second congestion-control scheme is OLIA [CONEXT12].  It is also
   an adaptation of the NewReno single path congestion-control scheme to
   support multiple paths.  Simulations [CONEXT12] and measurements
   [CONEXT13] have shown that it provides some performance benefits
   compared to the default coupled congestion-control scheme.

   The delay-based scheme proposed in [ICNP12] has also been ported to
   the Multipath TCP implementation in the Linux kernel.  It has been
   evaluated by using simulations [ICNP12] and measurements [PaaschPhD].

   BALIA, defined in [BALIA], provides a better balance between TCP
   friendliness, responsiveness, and window oscillation.

   These different congestion-control schemes have been compared in
   several articles.  [CONEXT13] and [PaaschPhD] compare these
   algorithms in an emulated environment.  The evaluation showed that
   the delay-based congestion-control scheme is less able to efficiently
   use the available links than the three other schemes.

3.3.  Subflow Management

   The multipath capability of Multipath TCP comes from the utilization
   of one subflow per path.  The Multipath TCP architecture [RFC6182]
   and the protocol specification [RFC6824] define the basic usage of
   the subflows and the protocol mechanisms that are required to create
   and terminate them.  However, there are no guidelines on how subflows
   are used during the lifetime of a Multipath TCP session.  Most of the
   published experiments with Multipath TCP have been performed in
   controlled environments.  Still, based on the experience running them
   and discussions on the mptcp-dev mailing list, interesting lessons
   have been learned about the management of these subflows.

   From a subflow viewpoint, the Multipath TCP protocol is completely
   symmetrical.  Both the client and the server have the capability to
   create subflows.  However, in practice, the existing Multipath TCP
   implementations have opted for a strategy where only the client
   creates new subflows.  The main motivation for this strategy is that
   often the client resides behind a NAT or a firewall, preventing
   passive subflow openings on the client.  Although there are
   environments such as datacenters where this problem does not occur,
   as of this writing, no precise requirement has emerged for allowing
   the server to create new subflows.

3.4.  Implemented Subflow Managers

   The Multipath TCP implementation in the Linux kernel includes several
   strategies to manage the subflows that compose a Multipath TCP
   session.  The basic subflow manager is the full-mesh.  As the name
   implies, it creates a full-mesh of subflows between the communicating
   hosts.

   The most frequent use case for this subflow manager is a multihomed
   client connected to a single-homed server.  In this case, one subflow
   is created for each interface on the client.  The current
   implementation of the full-mesh subflow manager is static.  The
   subflows are created immediately after the creation of the initial
   subflow.  If one subflow fails during the lifetime of the Multipath
   TCP session (e.g., due to excessive retransmissions or the loss of
   the corresponding interface), it is not always reestablished.  There
   is ongoing work to enhance the full-mesh path manager to deal with
   such events.

   When the server is multihomed, using the full-mesh subflow manager
   may lead to a large number of subflows being established.  For
   example, consider a dual-homed client connected to a server with
   three interfaces.  In this case, even if the subflows are only
   created by the client, six subflows will be established.  This may be
   excessive in some environments, in particular when the client and/or
   the server have a large number of interfaces.  Implementations should
   limit the number of subflows that are used.
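
   The subflow count in this example is simply the number of (client
   address, server address) pairs.  A minimal Python sketch, in which
   the addresses and the max_subflows cap are assumptions introduced for
   illustration rather than options of the Linux implementation:

```python
from itertools import product


def full_mesh(client_addrs, server_addrs, max_subflows=None):
    """Return the (client, server) address pairs a full mesh would use.

    max_subflows is a hypothetical cap showing how an implementation
    could limit the number of subflows.
    """
    pairs = list(product(client_addrs, server_addrs))
    return pairs if max_subflows is None else pairs[:max_subflows]


client = ["192.0.2.1", "198.51.100.1"]                 # dual-homed client
server = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]  # three interfaces

print(len(full_mesh(client, server)))                  # 2 x 3 pairs
print(len(full_mesh(client, server, max_subflows=4)))
```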

   Creating subflows between multihomed clients and servers may
   sometimes lead to operational issues, as observed in discussions on
   the mptcp-dev mailing list.  In some cases, the network operators
   would like to have better control over how the subflows are created
   by Multipath TCP [MPTCP-MAX-SUB].  This might require the definition
   of policy rules to control the operation of the subflow manager.  The
   two scenarios below illustrate some of these requirements.

                host1 ----------  switch1 ----- host2
                  |                   |            |
                  +--------------  switch2 --------+

                Figure 3: Simple Switched Network Topology

   Consider the simple network topology shown in Figure 3.  From an
   operational viewpoint, a network operator could want to create two
   subflows between the communicating hosts.  From a bandwidth
   utilization viewpoint, the most natural paths are host1-switch1-host2
   and host1-switch2-host2.  However, a Multipath TCP implementation
   running on these two hosts may sometimes have difficulty obtaining
   this result.

   To understand the difficulty, let us consider different allocation
   strategies for the IP addresses.  A first strategy is to assign two
   subnets: subnetA (resp. subnetB) contains the IP addresses of host1's
   interface to switch1 (resp. switch2) and host2's interface to switch1
   (resp. switch2).  In this case, a Multipath TCP subflow manager
   should only create one subflow per subnet.  To enforce the
   utilization of these paths, the network operator would have to
   specify a policy that prefers the subflows in the same subnet over
   subflows between addresses in different subnets.  It should be noted
   that the policy should probably also specify how the subflow manager
   should react when an interface or subflow fails.

   A second strategy is to use a single subnet for all IP addresses.  In
   this case, it becomes more difficult to specify a policy that
   indicates which subflows should be established.
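
   The difference between the two addressing strategies can be seen with
   a small Python sketch of a "same subnet only" policy.  The policy
   function and all addresses are assumptions made for illustration;
   no such policy engine is described for the Linux implementation.

```python
import ipaddress


def same_subnet_pairs(local_ifaces, remote_addrs):
    """Keep only (local, remote) pairs whose addresses share a subnet.

    local_ifaces: local addresses with prefix length, e.g. "10.0.1.1/24".
    remote_addrs: addresses advertised by the peer.
    """
    pairs = []
    for iface in map(ipaddress.ip_interface, local_ifaces):
        for remote in map(ipaddress.ip_address, remote_addrs):
            if remote in iface.network:
                pairs.append((str(iface.ip), str(remote)))
    return pairs


# First strategy: one subnet per switch, so the policy yields two subflows.
print(same_subnet_pairs(["10.0.1.1/24", "10.0.2.1/24"],
                        ["10.0.1.2", "10.0.2.2"]))

# Second strategy: a single subnet, so the policy cannot tell paths apart.
print(same_subnet_pairs(["10.0.0.1/24", "10.0.0.2/24"],
                        ["10.0.0.3", "10.0.0.4"]))
```

   With a single subnet, all four address pairs pass the filter, which
   is why a subnet-based rule is not sufficient in that case.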

   The second subflow manager that is currently supported by the
   Multipath TCP implementation in the Linux kernel is the ndiffport
   subflow manager.  This manager was initially created to exploit the
   path diversity that exists between single-homed hosts due to the
   utilization of flow-based load-balancing techniques [SIGCOMM11].
   This subflow manager creates N subflows between the same pair of IP
   addresses.  The N subflows are created by the client and differ only
   in the source port selected by the client.  It was not designed to be
   used on multihomed hosts.

   A more flexible subflow manager has been proposed, implemented and
   evaluated in [CONEXT15].  This subflow manager exposes various kernel
   events to a user space daemon that decides when subflows need to be
   created and terminated based on various policies.

3.5.  Subflow Destination Port

   The Multipath TCP protocol relies on the token contained in the
   MP_JOIN option to associate a subflow with an existing Multipath TCP
   session.  This implies that there is no restriction on the source
   address, destination address, and source or destination ports used
   for the new subflow.  The ability to use different source and
   destination addresses is key to supporting multihomed servers and
   clients.  The ability to use different destination port numbers is
   worth discussing because it has operational implications.

   For illustration, consider a dual-homed client that creates a second
   subflow to reach a single-homed server, as shown in Figure 4.

           client ------- r1 --- internet --- server
               |                   |
               +----------r2-------+


       Figure 4: Multihomed-Client Connected to Single-Homed Server

   When the Multipath TCP implementation in the Linux kernel creates the
   second subflow, it uses the same destination port as the initial
   subflow.  This choice is motivated by the fact that the server might
   be protected by a firewall and only accept TCP connections (including
   subflows) on the official port number.  Using the same destination
   port for all subflows is also useful for operators that rely on the
   port numbers to track application usage in their network.

   There have been suggestions from Multipath TCP users to modify the
   implementation to allow the client to use different destination ports
   to reach the server.  This suggestion seems mainly motivated by
   traffic-shaping middleboxes that are used in some wireless networks.
   In networks where different shaping rates are associated with
   different destination port numbers, this could allow Multipath TCP to
   achieve higher performance.  This behavior is valid according to the
   Multipath TCP specification [RFC6824].  An application could use an
   enhanced socket API [SOCKET] to behave in this way.

   However, from an implementation point of view, supporting different
   destination ports for the same Multipath TCP connection can cause
   some issues.  A legacy implementation of a TCP stack creates a
   listening socket to react to incoming SYN segments.  The listening
   socket handles the SYN segments that are sent to a specific port
   number.  Demultiplexing incoming segments can thus be done solely by
   looking at the IP addresses and the port numbers.  With Multipath
   TCP, however, incoming SYN segments may carry an MP_JOIN option and
   a different destination port.  This means that all incoming segments
   that did not match an existing listening socket or an already
   established socket must be parsed for a possible MP_JOIN option.
   This imposes an additional cost on servers, previously not existent
   on legacy TCP implementations.
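
   The extra lookup step can be sketched as follows.  This is a
   simplified model and not the Linux implementation: the function
   name, the dictionary-based socket tables, and the fields of the
   segment are all hypothetical.

```python
def demultiplex_segment(segment, established, listeners, mptcp_tokens):
    """Return a label for the socket that handles an incoming segment.

    segment is a dict with the keys 'tuple' (the 4-tuple), 'dst_port'
    and, optionally, 'mp_join_token'.  All tables are hypothetical.
    """
    # 1. Legacy TCP: exact match on the 4-tuple of an established socket.
    if segment["tuple"] in established:
        return established[segment["tuple"]]
    # 2. Legacy TCP: a listening socket bound to the destination port.
    if segment["dst_port"] in listeners:
        return listeners[segment["dst_port"]]
    # 3. Extra step needed by Multipath TCP: parse the options of the
    #    otherwise unmatched segment and look for an MP_JOIN token.
    token = segment.get("mp_join_token")
    if token is not None and token in mptcp_tokens:
        return mptcp_tokens[token]
    return None  # no match at all


listeners = {80: "listener:80"}
mptcp_tokens = {0xABCD1234: "mptcp-session-1"}
client = ("203.0.113.5", 40000, "198.51.100.20")

on_port = {"tuple": client + (80,), "dst_port": 80}
other_port = {"tuple": client + (8080,), "dst_port": 8080,
              "mp_join_token": 0xABCD1234}
no_match = {"tuple": client + (8080,), "dst_port": 8080}
```

   Only segments that reach the third step pay the additional parsing
   cost, but a server has to perform this step for every segment that
   would otherwise have been rejected immediately.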

3.6.  Closing Subflows

                    client                       server
                       |                           |
   MPTCP: ESTABLISHED  |                           | MPTCP: ESTABLISHED
   Sub: ESTABLISHED    |                           | Sub: ESTABLISHED
                       |                           |
                       |         DATA_FIN          |
   MPTCP: CLOSE-WAIT   | <------------------------ | close()   (step 1)
   Sub: ESTABLISHED    |         DATA_ACK          |
                       | ------------------------> | MPTCP: FIN-WAIT-2
                       |                           | Sub: ESTABLISHED
                       |                           |
                       |  DATA_FIN + subflow-FIN   |
   close()/shutdown()  | ------------------------> | MPTCP: TIME-WAIT
   (step 2)            |        DATA_ACK           | Sub: CLOSE-WAIT
   MPTCP: CLOSED       | <------------------------ |
   Sub: FIN-WAIT-2     |                           |
                       |                           |
                       |        subflow-FIN        |
   MPTCP: CLOSED       | <------------------------ | subflow-close()
   Sub: TIME-WAIT      |        subflow-ACK        |
   (step 3)            | ------------------------> | MPTCP: TIME-WAIT
                       |                           | Sub: CLOSED
                       |                           |


    Figure 5: Multipath TCP may not be able to avoid time-wait state on
    the subflow (indicated as Sub in the drawing), even if enforced by
                    the application on the client-side.

   Figure 5 shows a very particular issue within Multipath TCP.  Many
   high-performance applications try to avoid the TIME-WAIT state by
   deferring the closure of the connection until the peer has sent a
   FIN.  That way, the client on the left of Figure 5 does a passive
   closure of the connection, transitioning from CLOSE-WAIT to LAST-ACK
   and finally freeing the resources after reception of the ACK of the
   FIN.  An application running on top of an MPTCP-enabled Linux kernel
   might also use this approach.  The difference here is that the
   close() of the connection (step 1 in Figure 5) only triggers the
   sending of a DATA_FIN.  Nothing guarantees that the kernel is ready
   to combine the DATA_FIN with a subflow-FIN.  The reception of the
   DATA_FIN will make the application trigger the closure of the
   connection (step 2), trying to avoid the TIME-WAIT state with this
   late closure.  This time, the kernel might decide to combine the
   DATA_FIN with a subflow-FIN.  This decision defeats the application's
   intent, as the subflow's state machine will not transition from
   CLOSE-WAIT to LAST-ACK, but rather go through FIN-WAIT-2 into the
   TIME-WAIT state.  The TIME-WAIT state will consume resources on the
   host for at least 2 MSL (Maximum Segment Lifetime).  Thus, a smart
   application that tries to avoid the TIME-WAIT state by doing late
   closure of the connection actually ends up with one of its subflows
   in the TIME-WAIT state.  A high-performance Multipath TCP kernel
   implementation should honor the desire of the application to do
   passive closure of the connection and successfully avoid the
   TIME-WAIT state, even on the subflows.

   The solution to this problem lies in an optimistic assumption that a
   host doing active-closure of a Multipath TCP connection by sending a
   DATA_FIN will soon also send a FIN on all its subflows.  Thus, the
   passive closer of the connection can simply wait for the peer to send
   exactly this FIN -- enforcing passive closure even on the subflows.
   Of course, to avoid consuming resources indefinitely, a timer must
   limit the time our implementation waits for the FIN.
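
   The decision taken by the passive closer can be summarized by the
   following sketch.  It is a minimal model under stated assumptions,
   not kernel code: the function name, the returned labels, and the
   max_wait parameter (in seconds) are hypothetical.

```python
def on_application_close(subflow_fin_received, waited, max_wait):
    """Decide what the passive closer does with a subflow after close().

    Returns one of:
      'send_fin'        - the peer's subflow FIN has arrived, so sending
                          our FIN keeps the subflow on the passive-close
                          path (CLOSE-WAIT to LAST-ACK), avoiding
                          TIME-WAIT.
      'wait'            - send only the DATA_FIN for now and keep
                          waiting for the peer's subflow FIN.
      'send_fin_anyway' - the timer expired; close actively and accept
                          that the subflow may end in TIME-WAIT.
    """
    if subflow_fin_received:
        return "send_fin"
    if waited < max_wait:
        return "wait"
    return "send_fin_anyway"


# Hypothetical timer value of 2 seconds.
print(on_application_close(False, waited=0.5, max_wait=2.0))
print(on_application_close(True, waited=0.7, max_wait=2.0))
print(on_application_close(False, waited=2.0, max_wait=2.0))
```

   The timer bounds the resources held while waiting; its value is a
   trade-off that the text above leaves to the implementation.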

3.7.  Packet Schedulers

   In a Multipath TCP implementation, the packet scheduler is the
   algorithm that is executed when transmitting each packet to decide on
   which subflow it needs to be transmitted.  The packet scheduler
   itself does not have any impact on the interoperability of Multipath
   TCP implementations.  However, it may clearly impact the performance
   of Multipath TCP sessions.  The Multipath TCP implementation in the
   Linux kernel supports a pluggable architecture for the packet
   scheduler [PaaschPhD].  As of this writing, two schedulers have been
   implemented: round-robin and lowest-rtt-first.  The second scheduler
   relies on the round-trip time (rtt) measured on each TCP subflow and
   sends segments first over the subflow having the lowest round-trip
   time.  The two schedulers are compared in [CSWS14].  The experiments
   and measurements described in [CSWS14] show that the lowest-rtt-first
   scheduler appears to be the best compromise from a performance
   viewpoint.  Another study of the packet schedulers is presented in
   [PAMS2014].  This study relies on simulations with the Multipath TCP
   implementation in the Linux kernel.  Its authors compare the
   lowest-rtt-first scheduler with the round-robin and a random
   scheduler.  They show some situations where the lowest-rtt-first
   scheduler does not perform as well as the other schedulers, but
   there are many scenarios where the opposite is true.  [PAMS2014]
   notes that "it is highly likely that
   the optimal scheduling strategy depends on the characteristics of the
   paths being used."
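
   The difference between the two schedulers can be sketched as
   follows.  This is a simplified model and not the Linux code: the
   subflow fields, the values, and the rule that a subflow is only
   eligible while its congestion window has free space are assumptions
   made for the example.

```python
from itertools import cycle


def make_lowest_rtt_first(subflows):
    """Scheduler that picks the eligible subflow with the lowest RTT."""
    def pick():
        eligible = [sf for sf in subflows if sf["in_flight"] < sf["cwnd"]]
        return min(eligible, key=lambda sf: sf["rtt_ms"]) if eligible else None
    return pick


def make_round_robin(subflows):
    """Scheduler that visits the subflows in turn, skipping full ones."""
    order = cycle(subflows)

    def pick():
        for _ in range(len(subflows)):
            sf = next(order)
            if sf["in_flight"] < sf["cwnd"]:
                return sf
        return None
    return pick


def schedule(pick, n_segments):
    """Send up to n_segments and return the subflow used for each one."""
    sent = []
    for _ in range(n_segments):
        sf = pick()
        if sf is None:  # every congestion window is full
            break
        sf["in_flight"] += 1
        sent.append(sf["name"])
    return sent


def example_subflows():
    # Hypothetical subflows; cwnd and in_flight are counted in packets.
    return [
        {"name": "wifi", "rtt_ms": 20, "cwnd": 2, "in_flight": 0},
        {"name": "cellular", "rtt_ms": 60, "cwnd": 2, "in_flight": 0},
    ]


print(schedule(make_lowest_rtt_first(example_subflows()), 4))
print(schedule(make_round_robin(example_subflows()), 4))
```

   With these values, lowest-rtt-first fills the congestion window of
   the WiFi subflow before using the cellular one, while round-robin
   alternates between the two subflows.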

3.8.  Segment Size Selection

   When an application performs a write/send system call, the kernel
   allocates a packet buffer (sk_buff in Linux) to store the data the
   application wants to send.  The kernel will store at most one MSS
   (Maximum Segment Size) of data per buffer.  As the MSS can differ
   amongst subflows, an MPTCP implementation must carefully select the
   MSS used to generate application data.  The Linux kernel
   implementation had various ways of selecting the MSS: minimum or
   maximum amongst the different subflows.  However, these heuristics of
   MSS selection can cause significant performance issues in some
   environments.  Consider the following example.  An MPTCP connection
   has two established subflows that respectively use an MSS of 1420 and
   1428 bytes.  If MPTCP selects the maximum, then the application will
   generate segments of 1428 bytes of data.  An MPTCP implementation
   will have to split each such segment into two segments (1420 bytes
   and 8 bytes) when pushing it on the subflow with the smallest MSS.
   The latter segment introduces a large overhead, as 1428 bytes of
   data now use two slots in the congestion window (in packets),
   roughly halving the potential throughput (in bytes/s) of this
   subflow.  Taking the smallest MSS does not solve the issue, as there
   might be a case where the subflow with the smallest MSS only sends a
   few packets, therefore reducing the potential throughput of the
   other subflows.

   The Linux implementation recently took another approach [DetalMSS].
   Instead of selecting the minimum or maximum value, it now
   dynamically adapts the MSS based on the contribution of all the
   subflows to the connection's throughput.  For each subflow, it
   computes the potential throughput achieved by selecting each MSS
   value, taking into account the lost space in the congestion
   window.  It then selects the MSS that achieves the highest
   potential throughput.
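
   The arithmetic behind this selection can be sketched as follows.
   This is a simplified model and not the algorithm of [DetalMSS]: it
   assumes that each subflow sends one congestion window (in packets)
   per round-trip time, and the function names, window sizes, and
   round-trip times are hypothetical.

```python
def packets_per_chunk(chunk, subflow_mss):
    """Number of packets needed to carry one chunk of application data."""
    return -(-chunk // subflow_mss)  # integer ceiling division


def potential_throughput(chunk, subflows):
    """Bytes per second if data is generated in chunks of 'chunk' bytes."""
    total = 0.0
    for sf in subflows:
        # Window slots consumed by the tail of a split chunk are lost.
        chunks_per_rtt = sf["cwnd"] // packets_per_chunk(chunk, sf["mss"])
        total += chunks_per_rtt * chunk / sf["rtt"]
    return total


def select_mss(subflows):
    """Pick the subflow MSS that maximizes the modeled throughput."""
    candidates = sorted({sf["mss"] for sf in subflows})
    return max(candidates, key=lambda m: potential_throughput(m, subflows))


# Two subflows with the MSS values used in the example above.
balanced = [
    {"mss": 1420, "cwnd": 10, "rtt": 0.05},
    {"mss": 1428, "cwnd": 10, "rtt": 0.05},
]
# The subflow with the smallest MSS only sends a few packets.
skewed = [
    {"mss": 1420, "cwnd": 1, "rtt": 0.05},
    {"mss": 1428, "cwnd": 200, "rtt": 0.05},
]

print(select_mss(balanced))  # the smaller MSS avoids 8-byte segments
print(select_mss(skewed))    # the larger MSS favors the dominant subflow
```

   In the balanced case, generating 1428-byte chunks roughly halves the
   throughput of the 1420-byte subflow, so the model prefers 1420.  In
   the skewed case, the 8 extra bytes per packet on the dominant subflow
   outweigh the loss on the other one.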

   Given the prevalence of middleboxes that clamp the MSS, Multipath TCP
   implementations must be able to efficiently support subflows with
   different MSS values.  The strategy described above is a possible
   solution to this problem.

3.9.  Interactions with the Domain Name System

   Multihomed clients such as smartphones can send DNS queries over any
   of their interfaces.  When a single-homed client performs a DNS
   query, it receives from its local resolver the best answer for its
   request.  If the client is multihomed, the answer in response to the
   DNS query may vary with the interface over which it has been sent.

                      cdn1
                       |
           client -- cellular -- internet -- cdn3
              |                   |
              +----- wifi --------+
                       |
                     cdn2


                     Figure 6: Simple Network Topology

   If the client sends a DNS query over the WiFi interface, the answer
   will point to the cdn2 server while the same request sent over the
   cellular interface will point to the cdn1 server.  This might cause
   problems for CDN providers that locate their servers inside ISP
   networks and have contracts that specify that the CDN server will
   only be accessed from within this particular ISP.  Assume now that
   both the client and the CDN servers support Multipath TCP.  In this
   case, a Multipath TCP session from cdn1 or cdn2 would potentially use
   both the cellular network and the WiFi network.  Serving the client
   from cdn2 over the cellular interface could violate the contract
   between the CDN provider and the network operators.  A similar
   problem occurs with regular TCP if the client caches DNS replies.
   For example, the client obtains a DNS answer over the cellular
   interface and then stops this interface and starts to use its WiFi
   interface.  If the client retrieves data from cdn1 over its WiFi
   interface, this may also violate the contract between the CDN and the
   network operators.

   A possible solution to prevent this problem would be to modify the
   DNS resolution on the client.  The Client Subnet option of the
   Extension Mechanisms for DNS (EDNS), defined in [RFC7871], could be
   used for this purpose.  When the client sends a DNS query from its
   WiFi interface, it should also send the client subnet corresponding
   to the cellular interface in this request.  This would indicate to
   the resolver that the answer should be valid for both the WiFi and
   the cellular interfaces (e.g., the cdn3 server).

3.10.  Captive Portals

   Multipath TCP enables a host to use different interfaces to reach a
   server.  In theory, this should ensure connectivity when at least one
   of the interfaces is active.  However, in practice, there are some
   particular scenarios with captive portals that may cause operational
   problems.  The reference environment is shown in Figure 7.

           client -----  network1
                |
                +------- internet ------------- server

                    Figure 7: Issue with Captive Portal

   The client is attached to two networks: network1, which provides
   only limited connectivity, and a second network that provides access
   to the entire Internet.  In practice, this scenario corresponds to
   an open WiFi network with a captive portal for network1 and a
   cellular service for the second interface.  On many smartphones, the
   WiFi interface is preferred over the cellular interface.  If the
   smartphone learns a default route via both interfaces, it will
   typically prefer to use the WiFi interface to send its DNS request
   and create the first subflow.  This is not optimal with Multipath
   TCP.  A better approach would probably be to make a few attempts on
   the WiFi interface and then, upon failure of these attempts, try to
   use the second interface for the initial subflow as well.
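
   A minimal sketch of such a fallback policy is shown below.  The
   function name, the try_connect callback, and the default of three
   attempts are assumptions made for the example; the text above does
   not prescribe a specific number of attempts.

```python
def pick_initial_interface(interfaces, try_connect, attempts_on_preferred=3):
    """Choose the interface used for the initial subflow.

    interfaces is a non-empty list ordered by preference.  try_connect
    is a caller-supplied function that returns True when a connection
    attempt over the given interface succeeds.
    """
    preferred, *others = interfaces
    # Make a few attempts on the preferred (e.g., WiFi) interface first.
    for _ in range(attempts_on_preferred):
        if try_connect(preferred):
            return preferred
    # Upon failure, fall back to the remaining interfaces.
    for interface in others:
        if try_connect(interface):
            return interface
    return None


attempts = []


def behind_captive_portal(interface):
    """Fake connectivity check: WiFi is blocked by a captive portal."""
    attempts.append(interface)
    return interface == "cellular"


chosen = pick_initial_interface(["wifi", "cellular"], behind_captive_portal)
print(chosen, attempts)
```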

3.11.  Stateless Webservers

   MPTCP has been designed to interoperate with webservers that benefit
   from SYN-cookies to protect against SYN-flooding attacks [RFC4987].
   MPTCP achieves this by echoing the keys negotiated during the
   MP_CAPABLE handshake in the third ACK of the three-way handshake.
   Reception of this third ACK then allows the server to reconstruct the
   state specific to MPTCP.

   However, one caveat to this mechanism is the unreliable nature of the
   third ACK.  Indeed, when the third ACK gets lost, the server will not
   be able to reconstruct the MPTCP state.  MPTCP will fall back to
   regular TCP in this case.  This is in contrast to regular TCP.  When
   the client starts sending data, the first data segment also includes
   the SYN-cookie, which allows the server to reconstruct the TCP-state.
   Further, this data segment will be retransmitted by the client in
   case it gets lost and thus is resilient against loss.  MPTCP does not
   include the keys in this data segment and thus the server cannot
   reconstruct the MPTCP state.

   This issue might be considered a minor one for MPTCP, since losing
   the third ACK should only happen when packet loss is high.  However,
   it is precisely in this case that MPTCP provides a lot of benefits,
   as it can move traffic away from the lossy link.  It is therefore
   undesirable that MPTCP has a higher chance to fall back to regular
   TCP in those lossy environments.

   [MPTCP-DEPLOY] discusses this issue and suggests a modified handshake
   mechanism that ensures reliable delivery of the MP_CAPABLE, following
   the three-way handshake.  This modification will make MPTCP reliable,
   even in lossy environments when servers need to use SYN-cookies to
   protect against SYN-flooding attacks.

3.12.  Load-Balanced Server Farms

   Large-scale server farms typically deploy thousands of servers behind
   a single virtual IP (VIP).  Steering traffic to these servers is done
   through Layer 4 load-balancers that ensure that a TCP-flow will
   always be routed to the same server [Presto08].

   As Multipath TCP uses multiple different TCP subflows to steer the
   traffic across the different paths, load-balancers need to ensure
   that all these subflows are routed to the same server.  This implies
   that the load-balancers need to track the MPTCP-related state,
   allowing them to parse the token in the MP_JOIN and assign those
   subflows to the appropriate server.  However, server farms typically
   deploy several load-balancers for reliability and capacity reasons.
   As a TCP subflow might get routed to any of these load-balancers,
   they would need to synchronize the MPTCP-related state -- a solution
   that is not feasible on a large scale.

   The token (carried in the MP_JOIN) contains the information
   indicating to which MPTCP-session the subflow belongs.  As the token
   is a hash of the key, servers are not able to generate the token in
   such a way that the token can provide the necessary information to
   the load-balancers, which would allow them to route TCP subflows to
   the appropriate server.  [MPTCP-LOAD] discusses this issue in detail
   and suggests two alternative MP_CAPABLE handshakes to overcome it.
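
   The following sketch shows why the token cannot carry routing
   information.  [RFC6824] defines the token as the most significant 32
   bits of the SHA-1 hash of the 64-bit key.  The function name, the
   session table, and the server name are hypothetical.

```python
import hashlib
import os


def mptcp_token(key: bytes) -> int:
    """Token of RFC 6824: most significant 32 bits of SHA-1(key)."""
    if len(key) != 8:
        raise ValueError("MPTCP keys are 64 bits long")
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


# The server picks a random key; the token follows from the hash and
# cannot be chosen so that it encodes, e.g., a server identifier.
key = os.urandom(8)
token = mptcp_token(key)

# A load balancer therefore needs per-session state such as this
# (hypothetical) table to map the token of an MP_JOIN to a server.
session_table = {token: "server-17"}
print(f"token={token:#010x} -> {session_table[token]}")
```

   With several load balancers, every one of them would need a
   synchronized copy of such a table, which is the scaling problem
   described above.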

4.  Security Considerations

   This informational document discusses use cases and operational
   experience with Multipath TCP.  An extensive analysis of the
   remaining security issues in the Multipath TCP specification has been
   published in [RFC7430], together with suggestions for possible
   solutions.

   From a security viewpoint, it is important to note that Multipath
   TCP, like other multipath solutions such as SCTP, has the ability to
   send packets belonging to a single connection over different paths.
   This design feature of Multipath TCP implies that middleboxes that
   have been deployed on-path assuming that they would observe all the
   packets exchanged for a given connection in both directions may no
   longer function correctly.  Typical examples are firewalls,
   Intrusion Detection Systems (IDSs), and Deep Packet Inspection (DPI)
   devices deployed in enterprise networks.  Those devices expect to
   observe all the packets from all TCP connections.  With Multipath
   TCP, those middleboxes may no longer observe all packets since some
   of them may follow a different path.  The two examples below
   illustrate typical deployments of such middleboxes.  The first
   example, Figure 8, shows an MPTCP-enabled smartphone attached to
   both an enterprise and a cellular network.  If a Multipath TCP
   connection is established by the smartphone towards a server, some
   of the packets sent by the smartphone or the server may be
   transmitted over the cellular network and thus be invisible to the
   enterprise middlebox.

     smartphone +----- enterprise net --- MBox----+------ server
                |                                 |
                +----- cellular net  -------------+

              Figure 8: Enterprise Middlebox May Not Observe
                     All Packets from Multihomed Host

   The second example, Figure 9, shows a possible issue when multiple
   middleboxes are deployed inside a network.  For simplicity, we assume
   that network1 is the default IPv4 path while network2 is the default
   IPv6 path.  A similar issue could occur with per-flow load-balancing
   such as ECMP [RFC2992].  With regular TCP, all packets from each
   connection would either pass through Mbox1 or Mbox2.  With Multipath
   TCP, the client can easily establish a subflow over network1 and
   another over network2 and each middlebox would only observe a part of
   the traffic of the end-to-end Multipath TCP connection.

     client ----R-- network1  --- MBox1 -----R------------- server
                |                            |
                +-- network2  --- MBox2 -----+

                      Figure 9: Interactions between
                  Load-Balancing and Security Middleboxes

   In these two cases, it is possible for an attacker to evade some
   security measures that operate on the TCP byte stream and are
   implemented on the middleboxes by controlling the bytes that are
   actually sent over each subflow.  Tools exist that ease those kinds
   of evasion [PZ15] [PT14].  This is not a security issue for
   Multipath TCP itself since Multipath TCP behaves correctly.
   However, this demonstrates the difficulty of enforcing security
   policies by relying only on on-path middleboxes instead of enforcing
   them directly on the endpoints.
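
   The partial visibility of each middlebox can be illustrated with a
   toy model.  The sketch below is not one of the tools cited above: it
   assumes a middlebox that simply searches the bytes of the single
   subflow it observes for a fixed pattern, and the function names, the
   chunk size, and the payload are hypothetical.

```python
def split_stream(data, n_subflows=2, chunk=4):
    """Spread consecutive chunks of the byte stream over the subflows.

    Each subflow gets a list of (data_sequence_offset, bytes) pairs,
    mimicking the mapping between subflow data and connection data.
    """
    subflows = [[] for _ in range(n_subflows)]
    for i, offset in enumerate(range(0, len(data), chunk)):
        subflows[i % n_subflows].append((offset, data[offset:offset + chunk]))
    return subflows


def middlebox_view(mappings):
    """Bytes visible to a middlebox that observes a single subflow."""
    return b"".join(part for _, part in mappings)


def endpoint_view(subflows):
    """Byte stream rebuilt by the receiver from all the subflows."""
    pieces = sorted(piece for sf in subflows for piece in sf)
    return b"".join(part for _, part in pieces)


payload = b"GET /forbidden HTTP/1.1"
pattern = b"forbidden"  # pattern that each middlebox searches for
per_subflow = split_stream(payload)

for number, mappings in enumerate(per_subflow):
    print(number, middlebox_view(mappings), pattern in middlebox_view(mappings))
print(endpoint_view(per_subflow) == payload)
```

   Neither middlebox observes the pattern, while the endpoint, which
   receives all the subflows, reconstructs the complete byte stream.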